38 research outputs found
Fast filtering and animation of large dynamic networks
Detecting and visualizing the most relevant changes in an evolving
network is an open challenge in several domains. We present a fast algorithm
that filters subsets of the strongest nodes and edges representing an evolving
weighted graph and visualizes them by either creating a movie or streaming them
to an interactive network visualization tool. The algorithm is an approximation
of an exponential sliding time window that scales linearly with the number of
interactions. We compare the algorithm against rectangular and exponential
sliding time-window methods. Our network filtering algorithm: i) captures
persistent trends in the structure of dynamic weighted networks, ii) smooths
transitions between the snapshots of a dynamic network, and iii) uses limited
memory and processor time. The algorithm is publicly available as open-source
software. Comment: 6 figures, 2 tables
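As a rough illustration of the exponential sliding time window that the abstract refers to, the sketch below keeps one exponentially decayed weight per edge and updates it lazily, so each interaction costs constant time and the total cost grows linearly with the number of interactions. This is not the authors' approximation algorithm; the class name, the decay time scale `tau`, and the top-k filter are assumptions made for illustration only.

```python
import heapq
import math
from collections import defaultdict


class DecayedEdgeWeights:
    """Exponentially decayed edge weights, updated lazily per interaction.

    Hypothetical helper: a plain exponential time window, not the
    approximation described in the paper.
    """

    def __init__(self, tau):
        self.tau = tau                     # decay time scale (assumed parameter)
        self.weight = defaultdict(float)   # edge -> weight at its last update
        self.last_seen = {}                # edge -> time of its last update

    def current(self, edge, t):
        """Weight of `edge` at time t, decayed since its last update."""
        if edge not in self.last_seen:
            return 0.0
        elapsed = t - self.last_seen[edge]
        return self.weight[edge] * math.exp(-elapsed / self.tau)

    def add(self, u, v, t, w=1.0):
        """Register an interaction of strength w between u and v at time t."""
        edge = (u, v) if u <= v else (v, u)   # undirected edge key
        self.weight[edge] = self.current(edge, t) + w
        self.last_seen[edge] = t

    def strongest(self, t, k):
        """Return the k edges with the largest decayed weight at time t."""
        scored = ((self.current(edge, t), edge) for edge in self.weight)
        return heapq.nlargest(k, scored)


window = DecayedEdgeWeights(tau=10.0)
window.add("a", "b", t=0.0)
window.add("b", "a", t=10.0)   # same undirected edge, one time scale later
window.add("a", "c", t=10.0)
top = window.strongest(t=10.0, k=1)
```

Only the edges returned by `strongest` would be drawn in a given frame; because old interactions fade gradually instead of dropping out at a hard window boundary, consecutive frames change smoothly.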
Distinguishing Topical and Social Groups Based on Common Identity and Bond Theory
Social groups play a crucial role in social media platforms because they form
the basis for user participation and engagement. Groups are created explicitly
by members of the community, but also form organically as members interact. Due
to their importance, they have been studied widely (e.g., community detection,
evolution, activity, etc.). One of the key questions for understanding how such
groups evolve is whether there are different types of groups and how they
differ. In Sociology, theories have been proposed to help explain how such
groups form. In particular, the common identity and common bond theory states
that people join groups based on identity (i.e., interest in the topics
discussed) or bond attachment (i.e., social relationships). The theory has been
applied qualitatively to small groups to classify them as either topical or
social. We use the identity and bond theory to define a set of features to
classify groups into those two categories. Using a dataset from Flickr, we
extract user-defined groups and automatically-detected groups, obtained from a
community detection algorithm. We discuss the process of manual labeling of
groups into social or topical and present results of predicting the group label
based on the defined features. We directly validate the predictions of the
theory showing that the metrics are able to forecast the group type with high
accuracy. In addition, we present a comparison between declared and detected
groups along topicality and sociality dimensions. Comment: 10 pages, 6 figures, 2 tables
Complex networks approach to modeling online social systems. The emergence of computational social science
This thesis is devoted to quantitative description, analysis, and modeling of complex social systems in the form of online social networks. Statistical patterns of the systems under study are unveiled and interpreted using concepts and methods of network science, social network analysis, and data mining. A long-term promise of this research is that predicting the behavior of complex techno-social systems will be possible in a way similar to contemporary weather forecasting, using statistical inference and computational modeling based on the advancements in understanding and knowledge of techno-social systems. Although the subjects of this study are humans, as opposed to atoms or molecules in statistical physics, the availability of extremely large datasets on human behavior permits the use of tools and techniques of statistical physics. This dissertation deals with large datasets from online social networks, measures statistical patterns of social behavior, and develops quantitative methods, models, and metrics for complex techno-social systems.
Resilience of Supervised Learning Algorithms to Discriminatory Data Perturbations
Discrimination is a focal concern in supervised learning algorithms
augmenting human decision-making. These systems are trained using historical
data, which may have been tainted by discrimination, and may learn biases
against the protected groups. An important question is how to train models
without propagating discrimination. In this study, we i) define and model
discrimination as perturbations of a data-generating process and show how
discrimination can be induced via attributes correlated with the protected
attributes; ii) introduce a measure of resilience of a supervised learning
algorithm to potentially discriminatory data perturbations, iii) propose a
novel supervised learning algorithm that inhibits discrimination, and iv) show
that it is more resilient to discriminatory perturbations in synthetic and
real-world datasets than state-of-the-art learning algorithms. The proposed
method can be used with general supervised learning algorithms and avoids
inducement of discrimination, while maximizing model accuracy. Comment: 17 pages, 10 figures, 1 table
Estimating community feedback effect on topic choice in social media with predictive modeling
Social media users post content on various topics. A defining feature of social media is that other users can provide feedback (called community feedback) on their content in the form of comments, replies, and retweets. We hypothesize that the amount of received feedback influences the choice of topics on which a social media user posts. However, it is challenging to test this hypothesis as user heterogeneity and external confounders complicate measuring the feedback effect. Here, we investigate this hypothesis with a predictive approach based on an interpretable model of an author’s decision to continue the topic of their previous post. We explore the confounding factors, including author’s topic preferences and unobserved external factors such as news and social events, by optimizing the predictive accuracy. This approach enables us to identify which users are susceptible to community feedback. Overall, we find that 33% and 14% of active users in Reddit and Twitter, respectively, are influenced by community feedback. The model suggests that this feedback alters the probability of topic continuation by up to 14%, depending on the user and the amount of feedback.
Demographic Inference and Representative Population Estimates from Multilingual Social Media Data
Social media provide access to behavioural data at an unprecedented scale and granularity. However, using these data to understand phenomena in a broader population is difficult due to their non-representativeness and the bias of statistical inference tools towards dominant languages and groups. While demographic attribute inference could be used to mitigate such bias, current techniques are almost entirely monolingual and fail to work in a global environment. We address these challenges by combining multilingual demographic inference with post-stratification to create a more representative population sample. To learn demographic attributes, we create a new multimodal deep neural architecture for joint classification of age, gender, and organization-status of social media users that operates in 32 languages. This method substantially outperforms current state of the art while also reducing algorithmic bias. To correct for sampling biases, we propose fully interpretable multilevel regression methods that estimate inclusion probabilities from inferred joint population counts and ground-truth population counts.
In a large experiment over multilingual heterogeneous European regions, we show that our demographic inference and bias correction together allow for more accurate estimates of populations and make a significant step towards representative social sensing in downstream applications with multilingual social media.
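To make the bias-correction step more concrete, the sketch below shows plain cell-based post-stratification: the inclusion probability of each demographic cell is estimated as the ratio of the inferred sample count to the ground-truth population count, and each sampled user is weighted by its inverse. This is a simplified stand-in, not the multilevel regression described in the abstract; the function name, the cell labels, and all counts are hypothetical.

```python
def poststratification_weights(sample_counts, population_counts):
    """Per-cell weights: the inverse of the estimated inclusion probability.

    sample_counts: users per demographic cell inferred from social media.
    population_counts: ground-truth (e.g. census) counts for the same cells.
    Cells with no sampled users are skipped; they cannot be reweighted.
    """
    weights = {}
    for cell, n_sample in sample_counts.items():
        if n_sample == 0:
            continue
        inclusion_probability = n_sample / population_counts[cell]
        weights[cell] = 1.0 / inclusion_probability
    return weights


# Made-up counts: the sample over-represents the younger cell.
sample = {"18-29": 80, "30+": 20}
census = {"18-29": 1000, "30+": 1000}
weights = poststratification_weights(sample, census)

# Weighted sample sizes reproduce the census counts in every cell.
weighted_totals = {cell: sample[cell] * w for cell, w in weights.items()}
```

With these illustrative numbers each user in the under-represented cell counts four times as much as a user in the over-represented one, which is the basic effect that a regression-based estimate of inclusion probabilities refines.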
Social features of online networks: the strength of intermediary ties in online social media
An increasing fraction of today's social interactions occur using online social
media as communication channels. Recent worldwide events, such as social
movements in Spain or revolts in the Middle East, highlight their capacity to
boost coordination among people. Online networks display in general a rich internal
structure where users can choose among different types and intensity of
interactions. Despite this, there are still open questions regarding the
social value of online interactions. For example, the existence of users with
millions of online friends casts doubt on the relevance of these relations. In
this work, we focus on Twitter, one of the most popular online social networks,
and find that the network formed by the basic type of connections is organized
in groups. The activity of the users conforms to the landscape determined by
such groups. Furthermore, Twitter's distinction between different types of
interactions allows us to establish a parallelism between online and offline
social networks: personal interactions are more likely to occur on internal
links to the groups (the weakness of strong ties), while events transmitting new
information go preferentially through links connecting different groups (the
strength of weak ties) or even more through links connecting to users belonging
to several groups that act as brokers (the strength of intermediary ties). Comment: 14 pages, 18 figures
Complex Networks approach to modeling online social systems: The emergence of computational social science
Doctoral thesis presented by Przemyslaw A. Grabowicz for the degree of Doctor in the Physics Program of the Department of Physics of the Universitat de les Illes Balears, carried out at IFISC. This thesis is devoted to quantitative description, analysis, and modeling of complex social systems in the form of online social networks. Statistical patterns of the systems under study are unveiled and interpreted using concepts and methods of network science, social network analysis, and data mining. A long-term promise of this research is that predicting the behavior of complex techno-social systems will be possible in a way similar to contemporary weather forecasting, using statistical inference and computational modeling based on the advancements in understanding and knowledge of techno-social systems. Although the subjects of this study are humans, as opposed to atoms or molecules in statistical physics, the availability of extremely large datasets on human behavior permits the use of tools and techniques of statistical physics. This dissertation deals with large datasets from online social networks, measures statistical patterns of social behavior, and develops quantitative methods, models, and metrics for complex techno-social systems. This dissertation has been developed thanks to the support of CSIC JAE Predoc program and research projects of the Institute of Interdisciplinary Physics and Complex Systems in Palma de Mallorca. Peer Reviewed